AITopics | attention mask

attention mask

Adaptive parallel reasoning: the next paradigm in efficient inference scaling

AIHub | Jul-2-2026, 08:44:14 GMT

What if a reasoning model could decide when to decompose and parallelize independent subtasks, how many concurrent threads to spawn, and how to coordinate them based on the problem at hand? We provide a detailed analysis of recent progress in the field of parallel reasoning, especially adaptive parallel reasoning. Disclosure: this post is part landscape survey, part perspective on adaptive parallel reasoning. One of the authors (Tony Lian) co-led ThreadWeaver (Lian et al., 2025), one of the methods discussed below. The authors aim to present each approach on its own terms. Recent progress in LLM reasoning capabilities has been largely driven by inference-time scaling, in addition to data and parameter scaling (OpenAI et al., 2024; DeepSeek-AI et al., 2025). Models that explicitly output reasoning tokens (through intermediate steps, backtracking, and exploration) now dominate math, coding, and agentic benchmarks.

large language model, machine learning, natural language, (19 more...)

AIHub

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework

Neural Information Processing Systems | Jun-22-2026, 18:02:54 GMT

Graph Transformers (GTs) have emerged as a powerful paradigm for graph representation learning due to their ability to model diverse node interactions. However, existing GTs often rely on intricate architectural designs tailored to specific interactions, limiting their flexibility. To address this, we propose a unified hierarchical mask framework that reveals an underlying equivalence between model architecture and attention mask construction. This framework enables a consistent modeling paradigm by capturing diverse interactions through carefully designed attention masks. Theoretical analysis under this framework demonstrates that the probability of correct classification positively correlates with the receptive field size and label consistency, leading to a fundamental design principle: an effective attention mask should ensure both a sufficiently large receptive field and a high level of label consistency.
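As a rough illustration of how an attention mask alone can set each node's receptive field, the following minimal sketch applies single-head self-attention to a toy graph under two different masks. It is not the paper's hierarchical framework: the function `masked_attention`, the 4-node path graph, and the choice of using the raw features as both queries and keys are all assumptions made for brevity.

```python
import numpy as np

def masked_attention(x, mask):
    """Single-head self-attention; mask[i, j] = True lets node i attend to node j.

    Assumption for brevity: queries, keys, and values are all the raw features x.
    Every row of the mask must allow at least one position (here, the node itself).
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                 # pairwise similarity scores
    scores = np.where(mask, scores, -np.inf)      # masked pairs get zero weight after softmax
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights

# Hypothetical 4-node path graph: 0 - 1 - 2 - 3
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=bool)

local_mask = adj | np.eye(4, dtype=bool)    # 1-hop neighbours plus self (small receptive field)
global_mask = np.ones((4, 4), dtype=bool)   # full attention (largest receptive field)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                 # random node features, for illustration only

_, w_local = masked_attention(x, local_mask)
_, w_global = masked_attention(x, global_mask)
```

With `local_mask`, node 0 places no weight on nodes 2 and 3, whereas `global_mask` lets every node attend to every other. Swapping the mask changes which interactions are modeled without touching the attention code itself.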

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

LaViDa: A Large Diffusion Language Model for Multimodal Understanding

Neural Information Processing Systems | Jun-20-2026, 04:11:40 GMT

Modern Vision-Language Models (VLMs) can solve a wide range of tasks requiring visual reasoning. In real-world scenarios, desirable properties for VLMs include fast inference and controllable generation (e.g., constraining outputs to adhere to a desired format).

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Visual Question Answering with Question Representation Update (QRU)

Ruiyu Li, Jiaya Jia

Neural Information Processing Systems | Apr-22-2026, 14:34:25 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Order-Independence Without Fine Tuning

Reid McIlroy-Young, Katrina Brown, Conlan Olson, Linjun Zhang, Cynthia Dwork

Neural Information Processing Systems | Feb-16-2026, 07:50:54 GMT

Unlike humans, these 'Large Language Models' (LLMs) are highly sensitive to small changes in their inputs, leading to unwanted inconsistency in their behavior.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Maryland > Baltimore (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Flow-based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection - Appendix Haibao Yu 1, 2, Yingjuan Tang

Neural Information Processing Systems | Feb-13-2026, 16:56:23 GMT

Mean Average Precision (mAP). For VIC3D object detection, we focus on the obstacles around the ego vehicle. Two metrics are used for evaluation: BEV@mAP and 3D@mAP. BEV@mAP evaluates the 3D boxes in the bird's-eye view and ignores the ... In our implementation, we ignore the transmission cost of calibration files and timestamps. For early fusion, we calculate the transmission cost of transmitting raw data.

artificial intelligence, machine learning, transmission cost, (16 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.88)

9c93b3cd3bc60c0fe7b0c2d74a2da966-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 00:37:16 GMT

attention head, full attention, sbm-transformer, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(4 more...)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
(2 more...)

c2964caac096f26db222cb325aa267cb-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 04:06:20 GMT

continual learning, proceedings, ratt, (11 more...)

Neural Information Processing Systems

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Italy (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
(3 more...)

Genre:

Workflow (0.46)
Research Report (0.46)

Industry: Education > Educational Setting > Continuing Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

1ac3030fc57850b0fb11dfe9d4880ad7-Paper-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 19:17:17 GMT

layernorm, rank collapse, transformer, (17 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.67)

Enhancing low energy reconstruction and classification in KM3NeT/ORCA with transformers

Mateo, Iván Mozún

arXiv.org Artificial Intelligence | Nov-25-2025

The current KM3NeT/ORCA neutrino telescope, still under construction, has not yet reached its full potential in neutrino reconstruction capability. When training any deep learning model, no explicit information about the physics or the detector is provided, so both remain unknown to the model. This study leverages the strengths of transformers by incorporating attention masks inspired by the physics and detector design, making the model aware of both the telescope design and the neutrino physics measured with it. The study also shows the efficacy of transformers in retaining valuable information between detectors when fine-tuning from one configuration to another.

artificial intelligence, information, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2511.18999

Country: Europe > Italy > Sardinia (0.15)

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.59)
